feat: cache versioned kubelet kubectl package binaries #8287

Merged
awesomenix merged 1 commit into main from nishp/fastkubeletfrompkg on Apr 15, 2026

@awesomenix
Contributor

What this does

This change avoids unnecessary kubelet/kubectl package installation work during CSE when the corresponding binaries are already
available on the VHD.

Today, Ubuntu and Mariner/Azure Linux cache the kubelet/kubectl PMC packages on the VHD, but CSE still installs them via the package
manager/runtime package flow. That has a few downsides:

  • it re-runs package installation logic during provisioning
  • on Ubuntu, dpkg -i triggers package postinst behavior for kubelet.service
  • we pay extra provisioning latency even though AKS already owns kubelet service configuration and startup

This PR changes the flow so VHD build materializes versioned kubelet/kubectl binaries from the cached package artifacts into:

  • /opt/bin/kubelet-<k8sVersion>
  • /opt/bin/kubectl-<k8sVersion>

Then, during CSE, if the requested version is already present in /opt/bin, we reuse the existing cache-first path and do the final
rename into place instead of reinstalling from the package.

Changes

VHD build

  • extend vhdbuilder/packer/install-dependencies.sh so cached kubelet/kubectl package artifacts also produce versioned binaries
    under /opt/bin
  • support both:
    • Ubuntu: extract /usr/bin/<tool> from cached .deb
    • Mariner/Azure Linux: extract /usr/bin/<tool> from cached .rpm
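
A minimal sketch of the extraction step described above, assuming `dpkg-deb`, `rpm2cpio`, `cpio`, and `install` are available on the build host. The function name `extract_versioned_binary` and its arguments are hypothetical and are not the helper added to install-dependencies.sh; only the /usr/bin/<tool> source path and the /opt/bin/<tool>-<k8sVersion> target come from the description.

```shell
# Hypothetical helper: unpack a cached package into a scratch directory and
# copy usr/bin/<tool> to <dest_dir>/<tool>-<k8sVersion>.
# pkg_file must be an absolute path because the rpm branch changes directory.
extract_versioned_binary() {
    local pkg_file="$1" tool="$2" k8s_version="$3" dest_dir="${4:-/opt/bin}"
    local work_dir status

    work_dir=$(mktemp -d) || return 1
    case "${pkg_file}" in
        *.deb) dpkg-deb -x "${pkg_file}" "${work_dir}" ;;
        *.rpm) (cd "${work_dir}" && rpm2cpio "${pkg_file}" | cpio -idm --quiet) ;;
        *)     echo "unsupported package type: ${pkg_file}" >&2; false ;;
    esac
    status=$?

    # Treat a missing binary as a failure even if the unpack command exited 0.
    if [ "${status}" -eq 0 ] && [ -f "${work_dir}/usr/bin/${tool}" ]; then
        mkdir -p "${dest_dir}" &&
            install -m 0755 "${work_dir}/usr/bin/${tool}" "${dest_dir}/${tool}-${k8s_version}"
        status=$?
    else
        status=1
    fi

    rm -rf "${work_dir}"
    return "${status}"
}
```

Unpacking into a temporary directory and then checking for the file keeps the failure mode explicit: an archive that unpacks cleanly but does not contain the expected binary still returns non-zero.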

CSE

  • add a shared helper to detect whether versioned kubelet/kubectl binaries already exist in /opt/bin
  • update Ubuntu and Mariner package-based install paths to:
    • use the cached versioned binaries when available
    • fall back to existing package install behavior when the cache is not present
    • continue respecting SHOULD_ENFORCE_KUBE_PMC_INSTALL=true
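
The cache-first decision in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's helper: the function name `use_cached_kube_binary`, the optional `bin_dir` argument, and the return-code convention are hypothetical; the /opt/bin/<tool>-<version> naming and the SHOULD_ENFORCE_KUBE_PMC_INSTALL variable come from the description.

```shell
# Hypothetical sketch: return 0 when the cached binary was moved into place,
# 1 when the caller should fall back to the package install path.
use_cached_kube_binary() {
    local tool="$1" version="$2" bin_dir="${3:-/opt/bin}"
    local cached="${bin_dir}/${tool}-${version}"

    # Enforcement wins: never touch the cache when the package path is forced.
    if [ "${SHOULD_ENFORCE_KUBE_PMC_INSTALL:-false}" = "true" ]; then
        echo "PMC install enforced, not using cached ${tool}"
        return 1
    fi

    if [ ! -f "${cached}" ]; then
        echo "no cached ${tool}-${version} in ${bin_dir}"
        return 1
    fi

    # Cheap final step: rename into place and make sure it is executable.
    mv "${cached}" "${bin_dir}/${tool}" && chmod 0755 "${bin_dir}/${tool}"
}

# Intended call shape (the fallback function is a placeholder, not defined here):
#   use_cached_kube_binary kubelet "${version}" || install_from_package kubelet "${version}"
```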

Tests

  • add/extend coverage for the cache-first kubelet/kubectl flow
  • extend Linux VHD content validation to verify package-backed kubelet/kubectl versions also materialize versioned binaries in
    /opt/bin

Why

This keeps kubelet/kubectl aligned with the existing kubernetes-binaries flow:

  • cache versioned binaries on the image
  • do a cheap final move during provisioning

Expected benefits:

  • lower CSE latency
  • avoid unnecessary package-manager work during provisioning
  • avoid Ubuntu kubelet package postinst side effects when the VHD already has the requested binary

Notes

  • this does not remove the existing package fallback path
  • if the versioned binary is missing, CSE still falls back to the current package installation behavior
  • SHOULD_ENFORCE_KUBE_PMC_INSTALL=true still forces the package path for validation / test scenarios

Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes Linux node provisioning by reusing kubelet/kubectl binaries already cached on the VHD (materialized as versioned files under /opt/bin) instead of reinstalling via the package manager during CSE.

Changes:

  • VHD build: extract kubelet/kubectl binaries from cached .deb/.rpm artifacts into /opt/bin/<tool>-<k8sVersion>.
  • CSE: add shared helpers to detect/move cached versioned kube binaries; update Ubuntu and Mariner package-based kubelet/kubectl install paths to prefer cache unless SHOULD_ENFORCE_KUBE_PMC_INSTALL=true.
  • Tests: extend VHD content validation and add/extend ShellSpec coverage for the cache-first behavior.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

File | Description
vhdbuilder/packer/install-dependencies.sh | Adds VHD-build extraction of kubelet/kubectl binaries from cached package artifacts into versioned /opt/bin paths.
vhdbuilder/packer/test/linux-vhd-content-test.sh | Extends VHD content tests to validate versioned package-backed kubelet/kubectl binaries exist and match expected versions.
parts/linux/cloud-init/artifacts/cse_install.sh | Adds shared helpers to detect and move cached versioned kubelet/kubectl binaries; reuses them in the URL-based install flow.
parts/linux/cloud-init/artifacts/ubuntu/cse_install_ubuntu.sh | Updates Ubuntu package-based kubelet/kubectl install path to prefer cached versioned binaries.
parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh | Updates Mariner/AzureLinux package-based kubelet/kubectl install path to prefer cached versioned binaries.
spec/parts/linux/cloud-init/artifacts/cse_install_ubuntu_spec.sh | Adds ShellSpec coverage for Ubuntu cache-first kubelet/kubectl install behavior.
spec/parts/linux/cloud-init/artifacts/cse_install_mariner_spec.sh | Adds ShellSpec coverage for Mariner cache-first kubelet/kubectl install behavior.

Comment thread spec/parts/linux/cloud-init/artifacts/cse_install_ubuntu_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_install_mariner_spec.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/ubuntu/cse_install_ubuntu.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh Outdated
Comment thread vhdbuilder/packer/install-dependencies.sh
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from 98894d6 to 9614c74 on April 12, 2026 15:27
@awesomenix changed the title from "Cache versioned kubelet kubectl package binaries" to "feat: cache versioned kubelet kubectl package binaries" on Apr 12, 2026
Copilot AI review requested due to automatic review settings on April 12, 2026 15:34
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from 9614c74 to edbcee6 on April 12, 2026 15:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh Outdated
Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh Outdated
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from edbcee6 to b1dd3e8 on April 12, 2026 17:15
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Comment on lines +434 to +436
mv "${sourceBinary}" "${targetPath}"
chown root:root "${targetPath}"
chmod 0755 "${targetPath}"
Copilot AI Apr 13, 2026

extractRPMBinaryFromFile assigns the extracted binary to binaryPath, but later tries to move ${sourceBinary}, which is undefined. This will fail at runtime and break kubelet/kubectl (and credential-provider) extraction. Replace ${sourceBinary} with the selected ${binaryPath} (and consider using install -m0755 to avoid cross-filesystem mv edge cases).

Suggested change
- mv "${sourceBinary}" "${targetPath}"
- chown root:root "${targetPath}"
- chmod 0755 "${targetPath}"
+ install -m 0755 "${binaryPath}" "${targetPath}"

Copilot uses AI. Check for mistakes.
Comment on lines 1074 to +1083
  fallbackToKubeBinaryInstall() {
      packageName="${1:-}"
      packageVersion="${2:-}"
+     local targetPath="${3:-/opt/bin/${packageName}}"
      if [ "${packageName}" = "kubelet" ] || [ "${packageName}" = "kubectl" ]; then
          if [ "${SHOULD_ENFORCE_KUBE_PMC_INSTALL}" = "true" ]; then
              echo "Kube PMC install is enforced, skipping fallback to kube binary install for ${packageName}"
              return 1
          elif [ -f "/opt/bin/${packageName}-${packageVersion}" ]; then
-             mv "/opt/bin/${packageName}-${packageVersion}" "/opt/bin/${packageName}"
-             chmod a+x /opt/bin/${packageName}
-             rm -rf /opt/bin/${packageName}-* &
+             mv "/opt/bin/${packageName}-${packageVersion}" "${targetPath}"
Copilot AI Apr 13, 2026

The cache file naming on the VHD is /opt/bin/<tool>-<k8sVersion> (version stripped to X.Y.Z), but this helper looks for /opt/bin/<tool>-<packageVersion> verbatim. For RPMs, packageVersion may include a release suffix (e.g., 1.34.0-5.azl3), causing the cache-first path to be skipped even when /opt/bin/kubelet-1.34.0 exists. Normalize packageVersion to the intended <k8sVersion> (strip epoch and anything after the first -) and/or check both possible filenames.
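
The normalization suggested here could look roughly like the sketch below. The function name `normalize_k8s_version` is hypothetical; it strips an optional `epoch:` prefix and everything from the first `-` onward, using only POSIX parameter expansion.

```shell
# Hypothetical helper: reduce a package version to the bare Kubernetes version.
normalize_k8s_version() {
    local v="$1"
    v="${v#*:}"    # drop a leading "epoch:" if present; no-op when there is no colon
    v="${v%%-*}"   # drop the release suffix starting at the first "-"
    printf '%s\n' "${v}"
}

# Example: look up the cache file by normalized version.
cached="/opt/bin/kubelet-$(normalize_k8s_version "1.34.0-5.azl3")"
echo "${cached}"   # prints /opt/bin/kubelet-1.34.0
```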

Comment on lines 213 to +216
  echo "installing azure-acr-credential-provider package version: $packageVersion"
  mkdir -p "${CREDENTIAL_PROVIDER_BIN_DIR}"
  chown -R root:root "${CREDENTIAL_PROVIDER_BIN_DIR}"
- installRPMPackageFromFile "azure-acr-credential-provider" "${packageVersion}" || exit $ERR_CREDENTIAL_PROVIDER_DOWNLOAD_TIMEOUT
- ln -snf /usr/bin/azure-acr-credential-provider "$CREDENTIAL_PROVIDER_BIN_DIR/acr-credential-provider"
+ installRPMPackageFromFile "azure-acr-credential-provider" "${packageVersion}" "${CREDENTIAL_PROVIDER_BIN_DIR}/acr-credential-provider" || exit "$ERR_CREDENTIAL_PROVIDER_DOWNLOAD_TIMEOUT"
Copilot AI Apr 13, 2026

This switches azure-acr-credential-provider from an RPM install (which would ensure runtime dependencies are present) to “extract the binary from the RPM.” If the binary isn’t fully static or relies on RPM-triggered setup, this can break at runtime. If the intent is only to optimize kubelet/kubectl, keep credential-provider on the RPM install path; otherwise, add validation (e.g., dependency checks) and tests to guarantee the extracted binary works without installing the RPM.

Comment on lines +976 to +977
# shellcheck disable=SC3010
if [[ ! ${versionOutput} =~ ${k8sVersion} ]]; then
Copilot AI Apr 13, 2026

Using [[ ... =~ ${k8sVersion} ]] treats k8sVersion as a regex. Since Kubernetes versions contain dots, . will match any character, making this check more permissive than intended and potentially hiding mismatches. Prefer a fixed-string check (e.g., grep -F) or escape regex metacharacters before using =~.

Suggested change
- # shellcheck disable=SC3010
- if [[ ! ${versionOutput} =~ ${k8sVersion} ]]; then
+ if ! printf '%s\n' "${versionOutput}" | grep -Fq -- "${k8sVersion}"; then
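
To make the difference concrete, here is a small sketch comparing the two matching modes. The function names and the sample strings are synthetic and only for illustration. `grep -E` stands in for `[[ ... =~ ... ]]` (both use extended regular expressions, where `.` matches any character) so the sketch also runs in a plain POSIX shell.

```shell
# Regex match: "." in the pattern matches any single character.
matches_as_regex() { printf '%s\n' "$1" | grep -Eq -- "$2"; }

# Fixed-string match: the pattern is compared literally.
matches_as_fixed() { printf '%s\n' "$1" | grep -Fq -- "$2"; }

# Synthetic output that is clearly not version 1.34.0.
if matches_as_regex "v1x34y0" "1.34.0"; then
    echo "regex: accepted (false positive)"
fi
if ! matches_as_fixed "v1x34y0" "1.34.0"; then
    echo "fixed: rejected"
fi
```

Note that the fixed-string form is still a substring check, so `1.34.1` would also be found inside `v1.34.10`; anchoring on a delimiter would be needed if that distinction matters.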

}
function logs_to_events() {
echo "$2"
return 0
Copilot AI Apr 13, 2026

The specs assert that installRPMPackageFromFile would call extractRPMBinaryFromFile, but logs_to_events is stubbed to only echo the command string and does not execute it. This means runtime issues inside extractRPMBinaryFromFile (e.g., the current undefined sourceBinary bug) won’t be caught by tests. Update the stub to execute the provided command (e.g., eval "$2") and assert on resulting filesystem effects, or directly unit-test extractRPMBinaryFromFile with mocked rpm2cpio/cpio.

Suggested change
- return 0
+ eval "$2"

Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh
Comment thread parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh
Comment thread parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh
Comment thread parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh Outdated
Comment thread e2e/node_config.go Outdated
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from 4ed5d32 to 9defd9e on April 14, 2026 19:57
Copilot AI review requested due to automatic review settings on April 14, 2026 20:37
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from 9defd9e to 8aa484b on April 14, 2026 20:37
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


# query all package versions and get the latest version for matching k8s version
# e.g. 1.34.0-5.azl3
fullPackageVersion=$(dnf list ${packageName} --showduplicates | grep ${desiredVersion}- | awk '{print $2}' | sort -V | tail -n 1)
Copilot AI Apr 14, 2026

Unquoted ${packageName} and ${desiredVersion} can cause word-splitting/globbing and also make grep interpret version characters as regex metacharacters. Quote variables and consider grep -F -- "${desiredVersion}-" (fixed string) to avoid regex surprises and reduce injection risk from unexpected input.
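
A quoted, fixed-string variant of the pipeline might look like the sketch below. The helper name `latest_package_version` is hypothetical, and the "name.arch version repo" column layout of `dnf list --showduplicates` output is an assumption; the version column is selected first so the literal match applies only to versions.

```shell
# Hypothetical helper: read "name.arch  version  repo" lines on stdin and print
# the highest version that contains "<desired_version>-" as a literal string.
latest_package_version() {
    local desired_version="$1"
    awk '{print $2}' | grep -F -- "${desired_version}-" | sort -V | tail -n 1
}

# Intended use (not executed here):
#   fullPackageVersion=$(dnf list "${packageName}" --showduplicates | latest_package_version "${desiredVersion}")
```

Because `grep -F` treats the dots literally, a line such as `1x34y0-99.azl3` is not picked up for a desired version of `1.34.0`, and `sort -V` orders release numbers numerically so `-12` sorts after `-3`.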

Comment on lines +244 to +248
if [ "$OS" = "$UBUNTU_OS_NAME" ] || [ "$OS" = "$MARINER_OS_NAME" ]; then
testVersionedKubernetesPackageBinariesPresent "${name}" "${PACKAGE_VERSIONS[@]}"
else
echo "Skipping testVersionedKubernetesPackageBinariesPresent for ${OS}${OS_VARIANT:+ ${OS_VARIANT}}"
fi
Copilot AI Apr 14, 2026

The VHD build path extracts versioned kubelet/kubectl binaries for Mariner and Azure Linux (isMarinerOrAzureLinux), but this validation only runs for Ubuntu/Mariner. Expanding the condition to include Azure Linux (and relevant variants, if applicable) would ensure the new cache behavior is covered on all intended distros.
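
One way to widen the gate is a single predicate that names every distro the extraction runs on. This is only a sketch: the function name `should_validate_versioned_binaries` is hypothetical, and the literal OS name strings (`UBUNTU`, `MARINER`, `AZURELINUX`) are assumptions about the values the test script compares against.

```shell
# Hypothetical predicate; the OS name strings below are assumed values.
should_validate_versioned_binaries() {
    case "$1" in
        UBUNTU|MARINER|AZURELINUX) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended call shape inside the test (not executed here):
#   if should_validate_versioned_binaries "${OS}"; then
#       testVersionedKubernetesPackageBinariesPresent "${name}" "${PACKAGE_VERSIONS[@]}"
#   fi
```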

Comment thread parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 15, 2026 17:13
@awesomenix force-pushed the nishp/fastkubeletfrompkg branch from 5333f1d to bf6eff0 on April 15, 2026 17:13
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

parts/linux/cloud-init/artifacts/ubuntu/cse_install_ubuntu.sh:1

  • The PR description says CSE should “fall back to existing package install behavior when the cache is not present” and that SHOULD_ENFORCE_KUBE_PMC_INSTALL=true still forces the “package path”. After this change, the non-cached path no longer installs packages via the package manager (it downloads the .deb then extracts only /usr/bin/<tool>), so it won’t execute the existing install behavior (and may skip package-provided dependencies/metadata). Recommendation (mandatory): keep the “cached versioned binary” shortcut, but when the versioned binary is not present (or when enforcement is enabled), revert to the prior dpkg/apt-based install path (or explicitly document and implement dependency handling if extraction is intended to replace package install).
#!/bin/bash


  debFile="${downloadDir}/${debFile}"
- logs_to_events "AKS.CSE.install${packageName}.installDebPackageFromFile" "installDebPackageFromFile ${debFile}" || exit $ERR_APT_INSTALL_TIMEOUT
+ logs_to_events "AKS.CSE.install${packageName}.extractDebBinaryFromFile" "extractDebBinaryFromFile ${debFile} ${packageName} ${targetPath}" || exit "$ERR_APT_INSTALL_TIMEOUT"
Copilot AI Apr 15, 2026

The PR description says CSE should “fall back to existing package install behavior when the cache is not present” and that SHOULD_ENFORCE_KUBE_PMC_INSTALL=true still forces the “package path”. After this change, the non-cached path no longer installs packages via the package manager (it downloads the .deb then extracts only /usr/bin/<tool>), so it won’t execute the existing install behavior (and may skip package-provided dependencies/metadata). Recommendation (mandatory): keep the “cached versioned binary” shortcut, but when the versioned binary is not present (or when enforcement is enabled), revert to the prior dpkg/apt-based install path (or explicitly document and implement dependency handling if extraction is intended to replace package install).

Comment on lines +303 to +306
elif isMarinerOrAzureLinux "$OS"; then
local rpm_file

rpm_file=$(find "${download_dir}" -maxdepth 1 -name "${package_name}-${version_no_epoch}*" -print -quit 2>/dev/null) || rpm_file=""
Copilot AI Apr 15, 2026

The RPM extraction pipeline is not robust when the binary exists in only one of the candidate paths. Passing both ./usr/bin/... and ./usr/local/bin/... to a single cpio --to-stdout invocation can fail (or yield empty stdout) depending on which path exists, and the pipeline doesn’t explicitly validate a non-empty extraction before installing. Recommendation (mandatory): extract to a temp dir (like the Mariner CSE helper does) and then select the first existing candidate, or run cpio --to-stdout for one path and fall back to the other if it fails, ensuring failures are detected (ideally with set -o pipefail scoped locally if not globally enabled).

rm -rf "${extractDir}"
}

installPkgWithAptGet() {
Copilot AI Apr 15, 2026

installPkgWithAptGet no longer installs a package with apt/dpkg; it now (a) optionally uses the versioned binary fallback, otherwise (b) locates/downloads a .deb and extracts/moves a single binary out of it. Recommendation (mandatory): rename/split the function to reflect the new behavior (e.g., installBinaryFromDeb / extractDebBinaryFromFile + a separate “download deb” step), and keep the old name only if it still performs apt-based installation.


chmod a+x "${binaryPath}"
if [ "${packageName}" = "kubectl" ]; then
versionOutput=$("${binaryPath}" version 2>/dev/null)
Copilot AI Apr 15, 2026

kubectl version may attempt to contact an API server by default, which can cause the VHD content test to fail or hang in environments without cluster connectivity. Recommendation (mandatory): use a client-only invocation (e.g., kubectl version --client or an equivalent client-only flag supported by the targeted kubectl versions) to make this test deterministic.

Suggested change
- versionOutput=$("${binaryPath}" version 2>/dev/null)
+ versionOutput=$("${binaryPath}" version --client 2>/dev/null)

Comment on lines +411 to +422
local binaryPath=""

extractDir=$(mktemp -d) || return 1
if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm >/dev/null 2>&1); then
rm -rf "${extractDir}"
return 1
fi

for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do
if [ -f "${candidate}" ]; then
binaryPath="${candidate}"
break
Copilot AI Apr 15, 2026

Extracting archives with cpio -idm without hardening flags can be risky if the RPM payload is ever compromised (e.g., paths that try to escape the extraction directory, unexpected file types, ownership preservation). Recommendation (moderate): add safer cpio flags (such as disabling absolute filenames and ownership preservation where supported) and/or validate the extracted path before moving the binary, mirroring best practices for safely unpacking archives even into a temp dir.

Suggested change
-     local binaryPath=""
-     extractDir=$(mktemp -d) || return 1
-     if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm >/dev/null 2>&1); then
-         rm -rf "${extractDir}"
-         return 1
-     fi
-     for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do
-         if [ -f "${candidate}" ]; then
-             binaryPath="${candidate}"
-             break
+     local resolvedExtractDir
+     local binaryPath=""
+     local resolvedCandidate=""
+     extractDir=$(mktemp -d) || return 1
+     resolvedExtractDir=$(readlink -f "${extractDir}") || {
+         rm -rf "${extractDir}"
+         return 1
+     }
+     if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm --no-absolute-filenames --no-preserve-owner >/dev/null 2>&1); then
+         rm -rf "${extractDir}"
+         return 1
+     fi
+     for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do
+         if [ -f "${candidate}" ]; then
+             resolvedCandidate=$(readlink -f "${candidate}") || continue
+             case "${resolvedCandidate}" in
+                 "${resolvedExtractDir}"/*)
+                     binaryPath="${resolvedCandidate}"
+                     break
+                     ;;
+             esac

Comment thread e2e/toolkit/k8s.go
Comment on lines +7 to +17
func CheckK8sConstraint(kubernetesVersion string, constraintStr string) (bool, error) {
version, err := semver.NewVersion(kubernetesVersion)
if err != nil {
return false, err
}
constraint, err := semver.NewConstraint(constraintStr)
if err != nil {
return false, err
}
return constraint.Check(version), nil
}
Copilot AI Apr 15, 2026

CheckK8sConstraint is exported but has no GoDoc comment, which commonly fails repository linting and makes its intended input format (e.g., whether leading v is allowed) unclear. Recommendation (nit): add a short doc comment describing expected version formatting and what the constraint represents.

@awesomenix merged commit 320d4b0 into main on Apr 15, 2026
30 of 39 checks passed
@awesomenix deleted the nishp/fastkubeletfrompkg branch on April 15, 2026 18:34
calvin197 added a commit that referenced this pull request Apr 15, 2026
3 participants